Embedding Knowledge in Web Documents: CGs versus XML-based Metadata Languages
Authors
Abstract
The paper argues for the use of general and intuitive knowledge representation languages for indexing the content of Web documents and representing knowledge within them. We believe these languages have advantages over metadata languages based on the Extensible Markup Language (XML). Indeed, the representation and retrieval of precise information is better supported by languages designed to represent semantic content and support logical inference, and the readability of such a language eases its exploitation, presentation and direct insertion within a document. To further ease the representation process, we propose techniques allowing users to leave some knowledge terms undeclared. We illustrate these ideas with WebKB, a precision-oriented information retrieval/annotation tool, and show how lexical, structural and knowledge-based techniques may be combined to retrieve or generate knowledge or Web documents. Finally, to overcome the scalability problems of storing knowledge within Web documents, we propose some ideas for scalable and cooperatively built knowledge repositories.
Similar resources
Knowledge Retrieval and the World Wide Web
Large-scale Web search engines effectively retrieve entire documents, but they are imprecise, because they do not exploit and hence retrieve the semantic Web document content. We cannot automatically extract such content from general documents yet. Manually structuring Web documents, for example with XML, lets us retrieve more precise information using string- and structure-matching tools, such ...
Embedding Knowledge in Web Documents
The paper argues for the use of general and intuitive knowledge representation languages (and simpler notational variants, e.g. subsets of natural languages) for indexing the content of Web documents and representing knowledge within them. We believe that these languages have advantages over metadata languages based on the Extensible Markup Language (XML). Indeed, the retrieval of precise info...
Valid versus Meaningful: Raising the Level of Semantic Validation
"Traditional" schema languages for XML, such as XML Schema or Relax NG, are used to validate documents and ensure that they are syntactically correct. These schema languages, however, lack the expressive power and diagnostic capabilities to provide "semantic validation". We illustrate the need for such validation by examples taken from the Financial Products Markup Language (FpML) and XML Metadata ...
From XML to RDF: Syntax, Semantics, Security, and Integrity
In this paper we evaluate security methods for eXtensible Markup Language (XML) and the Resource Description Framework (RDF). We argue that existing models are insufficient to provide high-assurance security for future Web-based applications. We begin with a brief overview of XML access control models, where the protection objects are identified by the XML syntax. We show that these approaches...
A Web-Based Metadata Schema Repository
The metadata schema of a digital archive describes the structure and attributes of metadata. Analysis and definition of metadata schema for a new digital archive must be carefully carried out and determined at the first stage of development. To ease the task, we used an Extensible Markup Language (XML) structure to represent the metadata schema, and then designed and implemented a metadata sche...